Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection

Authors

Zhuo Chen

Yan Huang

Jinyu Li

Yifan Gong

Abstract

For single-channel speech enhancement, the mask learning based approach using neural networks has been shown to outperform the feature mapping approach and to be effective as a pre-processor for automatic speech recognition. However, its assumption that the mixture and the clean reference have corresponding scales does not hold for data collected in the real world, which leads to significant performance degradation on parallel recorded data. In this paper, we first extend mask learning based speech enhancement by integrating two types of restoration layers to address the scale mismatch problem. We further propose a novel residual learning based speech enhancement model that adds different shortcut connections to a feature mapping network. We show that such a structure can benefit from both mask learning and feature mapping. We evaluate the proposed speech enhancement models on CHiME 3 data. Without retraining the acoustic model, the best bidirectional LSTM with residual connections yields a 24.90% relative WER reduction on real data and 34.57% WER on simulated data.
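The scale mismatch described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' model: the scalar `gain` stands in for a restoration step, the function names are hypothetical, and magnitudes are assumed to add, which is a simplification. The sketch shows that a mask bounded to [0, 1] cannot reach a reference recorded at a higher gain than the mixture, and that an additive shortcut in the log-magnitude domain is equivalent to an unbounded multiplicative mask in the linear domain.

```python
import numpy as np

def ideal_ratio_mask(target_mag, mix_mag, eps=1e-8):
    # Ratio of target magnitude to mixture magnitude, per time-frequency bin.
    return target_mag / (mix_mag + eps)

def apply_bounded_mask(mix_mag, mask):
    # A sigmoid-style output layer can only produce mask values in [0, 1].
    return mix_mag * np.clip(mask, 0.0, 1.0)

def apply_mask_with_gain(mix_mag, mask, gain):
    # Hypothetical restoration step: rescale the masked output by a scalar gain.
    return gain * apply_bounded_mask(mix_mag, mask)

def residual_log_enhance(log_mix, correction):
    # Shortcut connection in the log domain: output = input + predicted correction.
    return log_mix + correction

rng = np.random.default_rng(0)
clean = rng.uniform(0.5, 1.0, size=(4, 3))   # toy magnitude spectrogram
noise = rng.uniform(0.1, 0.3, size=(4, 3))
mix = clean + noise                          # assumption: magnitudes add
reference = 2.0 * clean                      # reference recorded at a different gain

# The mask needed to reach the reference exceeds 1, so a bounded mask falls short.
needed_mask = ideal_ratio_mask(reference, mix)
bounded_estimate = apply_bounded_mask(mix, needed_mask)

# Masking toward the same-scale clean signal, then restoring the gain, reaches it.
restored = apply_mask_with_gain(mix, ideal_ratio_mask(clean, mix), gain=2.0)

# A log-domain residual correction is a multiplicative mask in the linear domain.
correction = np.log(reference) - np.log(mix)
residual_estimate = np.exp(residual_log_enhance(np.log(mix), correction))
```

In this toy setting `bounded_estimate` stays at the mixture level, while both `restored` and `residual_estimate` match `reference`. That is one way to read the claim that a shortcut connection lets a feature mapping network also behave like a mask.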


Similar resources

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...


An Iterative Phase Recovery Framework with Phase Mask for Spectral Mapping with an Application to Speech Enhancement

We propose an iterative phase recovery framework to improve spectral mapping with an application to improving the performance of state-of-the-art speech enhancement systems using magnitude-based spectral mapping with deep neural networks (DNNs). We further propose to use an estimated time-frequency mask to reduce sign uncertainty in the overlap-add waveform reconstruction algorithm. In a series...


Student-Teacher Learning for BLSTM Mask-based Speech Enhancement

Spectral mask estimation using bidirectional long short-term memory (BLSTM) neural networks has been widely used in various speech enhancement applications, and it has achieved great success when it is applied to multichannel enhancement techniques with a mask-based beamformer. However, when these masks are used for single channel speech enhancement they severely distort the speech signal and m...


A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

Quality of speech signal significantly reduces in the presence of environmental noise signals and leads to the imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, the single channel speech enhancement of the corrupted signals by the additive noise signals is considered. A dictionary-based algorithm is proposed to train the speech...


Improving Speech Intelligibility in Binaural Hearing Aids by Estimating a Time-Frequency Mask with a Weighted Least Squares Classifier

An efficient algorithm for speech enhancement in binaural hearing aids is proposed. The algorithm is based on the estimation of a time-frequency mask using supervised machine learning. The standard least-squares linear classifier is reformulated to optimize a metric related to speech/noise separation. The method is energy-efficient in two ways: the computational complexity is limited and the wi...



Publication date: 2017
